30 research outputs found

Towards plant pangenomics

Author: Batley Jacqueline
Edwards David
Golicz Agnieszka A.
Publication venue: 'Wiley'
Publication date: 01/04/2016

As an increasing number of genome sequences become available for a wide range of species, there is a growing understanding that the genome of a single individual is insufficient to represent the gene diversity within a whole species. Many studies examine the sequence diversity within genes, and this allelic variation is an important source of phenotypic variation which can be selected for by man or nature. However, the significant gene presence/absence variation that has been observed within species and the impact of this variation on traits is only now being studied in detail. The sum of the genes for a species is termed the pangenome, and the determination and characterization of the pangenome is a requirement to understand variation within a species. In this review, we explore the current progress in pangenomics as well as methods and approaches for the characterization of pangenomes for a wide range of plant species

University of Queensland eSpace

Grain dispersal mechanism in cereals arose from a genome duplication followed by changes in spatial expression of genes involved in pollen development

Author: Cross Arthur
Golicz Agnieszka A.
Li John B.
Pourkheirandish Mohammad
Waugh Robbie
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

KEY MESSAGE: Grain disarticulation in wild progenitor of wheat and barley evolved through a local duplication event followed by neo-functionalization resulting from changes in location of gene expression. ABSTRACT: One of the most critical events in the process of cereal domestication was the loss of the natural mode of grain dispersal. Grain dispersal in barley is controlled by two major genes, Btr1 and Btr2, which affect the thickness of cell walls around the disarticulation zone. The barley genome also encodes Btr1-like and Btr2-like genes, which have been shown to be the ancestral copies. While Btr and Btr-like genes are non-redundant, the biological function of Btr-like genes is unknown. We explored the potential biological role of the Btr-like genes by surveying their expression profile across 212 publicly available transcriptome datasets representing diverse organs, developmental stages and stress conditions. We found that Btr1-like and Btr2-like are expressed exclusively in immature anther samples throughout Prophase I of meiosis within the meiocyte. The similar and restricted expression profile of these two genes suggests they are involved in a common biological function. Further analysis revealed 141 genes co-expressed with Btr1-like and 122 genes co-expressed with Btr2-like, with 105 genes in common, supporting Btr-like genes involvement in a shared molecular pathway. We hypothesize that the Btr-like genes play a crucial role in pollen development by facilitating the formation of the callose wall around the meiocyte or in the secretion of callase by the tapetum. Our data suggest that Btr genes retained an ancestral function in cell wall modification and gained a new role in grain dispersal due to changes in their spatial expression becoming spike specific after gene duplication. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00122-022-04029-8

PubMed Central

University of Dundee Online Publications

The pangenome of hexaploid bread wheat

Author: Batley Jacqueline
Bayer Philipp E.
Chan Chon-Kit Kenneth
Doležel Jaroslav
Edwards David
Golicz Agnieszka A.
Hurgobin Bhavna
Lai Kaitao
Lee HueyTyng
Montenegro Juan D.
Muhindira Paul
Publication venue: 'Wiley'
Publication date: 01/01/2017

There is an increasing understanding that variation in gene presence–absence plays an important role in the heritability of agronomic traits; however, there have been relatively few studies on variation in gene presence–absence in crop species. Hexaploid wheat is one of the most important food crops in the world and intensive breeding has reduced the genetic diversity of elite cultivars. Major efforts have produced draft genome assemblies for the cultivar Chinese Spring, but it is unknown how well this represents the genome diversity found in current modern elite cultivars. In this study we build an improved reference for Chinese Spring and explore gene diversity across 18 wheat cultivars. We predict a pangenome size of 140 500 ± 102 genes, a core genome of 81 070 ± 1631 genes and an average of 128 656 genes in each cultivar. Functional annotation of the variable gene set suggests that it is enriched for genes that may be associated with important agronomic traits. In addition to variation in gene presence, more than 36 million intervarietal single nucleotide polymorphisms were identified across the pangenome. This study of the wheat pangenome provides insight into genome diversity in elite wheat as a basis for genomics-based improvement of this important crop. A wheat pangenome, GBrowse, is available at http://appliedbioinformatics.com.au/cgi-bin/gb2/gbrowse/WheatPan/, and data are available to download from http://wheatgenome.info/wheat_genome_databases.php

Crossref

Greenwich Academic Literature Archive

Queensland University of Technology ePrints Archive

Western Sydney ResearchDirect

University of Melbourne Institutional Repository

University of Queensland eSpace

Assembly and comparison of two closely related Brassica napus genomes

Author: Bancroft Ian
Batley Jacqueline
Bayer Philipp E.
Chalhoub Boulos
Chan Chon-Kit Kenneth
Edwards David
Golicz Agnieszka A.
Hurgobin Bhavna
King Graham J.
Lee HueyTyng
Li Ruiyuan
Long Yan
Meng Jinling
Renton Michael
Yuan Yuxuan
Zou Jun
Publication venue: 'Wiley'
Publication date: 01/01/2017

As an increasing number of plant genome sequences become available, it is clear that gene content varies between individuals, and the challenge arises to predict the gene content of a species. However, genome comparison is often confounded by variation in assembly and annotation. Differentiating between true gene absence and variation in assembly or annotation is essential for the accurate identification of conserved and variable genes in a species. Here, we present the de novo assembly of the B. napus cultivar Tapidor and comparison with an improved assembly of the Brassica napus cultivar Darmor-bzh. Both cultivars were annotated using the same method to allow comparison of gene content. We identified genes unique to each cultivar and differentiate these from artefacts due to variation in the assembly and annotation. We demonstrate that using a common annotation pipeline can result in different gene predictions, even for closely related cultivars, and repeat regions which collapse during assembly impact whole genome comparison. After accounting for differences in assembly and annotation, we demonstrate that the genome of Darmor-bzh contains a greater number of genes than the genome of Tapidor. Our results are the first step towards comparison of the true differences between B. napus genomes and highlight the potential sources of error in future production of a B. napus pangenome

HAL Evry

Crossref

ZENODO

White Rose Research Online

ProdInra

University of Melbourne Institutional Repository

University of Queensland eSpace

FigShare

The Genome of a Southern Hemisphere Seagrass Species (Zostera muelleri)

Author: Agnieszka A. Golicz
Andrew H. Paterson
Anthony W.D. Larkum
Chon-Kit Kenneth Chan
David Edwards
Gary A. Kendrick
Gaurav Sablok
Haibao Tang
HueyTyng Lee
Jacqueline Batley
Peter J. Ralph
Philipp E. Bayer
Rahul R. Krishnaraj
Yuannian Jiao
Publication venue: 'American Society of Plant Biologists (ASPB)'

Crossref

An investigation of causes of false positive single nucleotide polymorphisms using simulated reads from a small eukaryote genome

Author: A Gurevich
A McKenna
Agnieszka Golicz
Andrew J. Flavell
Antonio Ribeiro
B Langmead
B Nystedt
C Otto
Christine Anne Hackett
David Marshall
DR Zerbino
FJ Ribeiro
Gordon Stephen
H Li
I Milne
Iain Milne
J Dou
JP Hamilton
K Bradnam
K Lai
MA DePristo
Micha Bayer
MJ Chaisson
N You
NA Fonseca
PA Morin
PY Liao
R Nielsen
R Payne
S Gnerre
S Kumar
SF Altschul
TC Glenn
The Arabidopsis Genome Initiative
TIBGSC IBGSC
Z Chang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Background: Single Nucleotide Polymorphisms (SNPs) are widely used molecular markers, and their use has increased massively since the inception of Next Generation Sequencing (NGS) technologies, which allow detection of large numbers of SNPs at low cost. However, both NGS data and their analysis are error-prone, which can lead to the generation of false positive (FP) SNPs. We explored the relationship between FP SNPs and seven factors involved in mapping-based variant calling - quality of the reference sequence, read length, choice of mapper and variant caller, mapping stringency and filtering of SNPs by read mapping quality and read depth. This resulted in 576 possible factor level combinations. We used error- and variant-free simulated reads to ensure that every SNP found was indeed a false positive. Results: The variation in the number of FP SNPs generated ranged from 0 to 36,621 for the 120 million base pairs (Mbp) genome. All of the experimental factors tested had statistically significant effects on the number of FP SNPs generated and there was a considerable amount of interaction between the different factors. Using a fragmented reference sequence led to a dramatic increase in the number of FP SNPs generated, as did relaxed read mapping and a lack of SNP filtering. The choice of reference assembler, mapper and variant caller also significantly affected the outcome. The effect of read length was more complex and suggests a possible interaction between mapping specificity and the potential for contributing more false positives as read length increases. Conclusions: The choice of tools and parameters involved in variant calling can have a dramatic effect on the number of FP SNPs produced, with particularly poor combinations of software and/or parameter settings yielding tens of thousands in this experiment. 
Between-factor interactions make simple recommendations difficult for a SNP discovery pipeline but the quality of the reference sequence is clearly of paramount importance. Our findings are also a stark reminder that it can be unwise to use the relaxed mismatch settings provided as defaults by some read mappers when reads are being mapped to a relatively unfinished reference sequence from e.g. a non-model organism in its early stages of genomic exploration

Crossref

Springer - Publisher Connector

PubMed Central

University of Dundee Online Publications

University of Queensland eSpace

An efficient approach to BAC based assembly of complex genomes

Author: A D’Hont
Agnieszka A. Golicz
AH Paterson
Bhavna Hurgobin
C Alkan
C Soderlund
Chon-Kit Kenneth Chan
David Edwards
DR Kelley
H Šimková
H-B Zhang
Hana Šimková
Helena Staňková
JA Poland
Jacqueline Batley
Jaroslav Doležel
JP Tomkins
Juan Montenegro
KF Au
M Boetzer
M Chen
M Martin
M Mascher
M-C Luo
N Park
P Ruperao
Paul J. Berkman
Paul Visendi
Philipp E. Bayer
PJ Berkman
Pradeep Ruperao
PS Schnable
R Brenchley
R Kajitani
R Li
RA Martienssen
S Kurtz
S McGinnis
S Taudien
Satomi Hayashi
SH Kazakoff
SL Salzberg
SR Choi
T Eilam
T Wicker
T Yin
VL Chandler
WS Cleveland
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Background: There has been an exponential growth in the number of genome sequencing projects since the introduction of next generation DNA sequencing technologies. Genome projects have increasingly involved assembly of whole genome data which produces inferior assemblies compared to traditional Sanger sequencing of genomic fragments cloned into bacterial artificial chromosomes (BACs). While whole genome shotgun sequencing using next generation sequencing (NGS) is relatively fast and inexpensive, this method is extremely challenging for highly complex genomes, where polyploidy or high repeat content confounds accurate assembly, or where a highly accurate ‘gold’ reference is required. Several attempts have been made to improve genome sequencing approaches by incorporating NGS methods, to variable success. Results: We present the application of a novel BAC sequencing approach which combines indexed pools of BACs, Illumina paired read sequencing, a sequence assembler specifically designed for complex BAC assembly, and a custom bioinformatics pipeline. We demonstrate this method by sequencing and assembling BAC cloned fragments from bread wheat and sugarcane genomes. Conclusions: We demonstrate that our assembly approach is accurate, robust, cost effective and scalable, with applications for complete genome sequencing in large and complex genomes

Crossref

Greenwich Academic Literature Archive

Springer - Publisher Connector

Queensland University of Technology ePrints Archive

PubMed Central

University of Melbourne Institutional Repository

University of Queensland eSpace

The giant diploid faba genome unlocks variation in a global protein crop

Author: Andersen Stig Uggerhøj
Angra Deepti
Aubert Grégoire
Bednář Petr
Bornhofen Elesandro
Boussageon Raphaël
Cheung Kwok
Courty Pierre Emmanuel
Doležel Jaroslav
Fechete Lavinia I.
Golicz Agnieszka A.
Gundlach Heidrun
Hallab Asis
Himmelbach Axel
Holm Liisa U.
Imbert Baptiste
Janss Luc L.
Jayakodi Murukarthick
Kaur Sukhjiwan
Keeble-Gagnère Gabriel
Khazaei Hamid
Koblížková Andrea
Kobrlová Lucie
Krejčí Petra
Kreplak Jonathan
Macas Jiří
Mascher Martin
Mouritzen Troels W.
Nadzieja Marcin
Neumann Pavel
Nielsen Linda Kærgaard
Novák Petr
Orabi Jihad
O’Sullivan Donal Martin
Padmarasu Sudharsan
Robertson-Shersby-Harvie Tom
Robledillo Laura Ávila
Schiemann Andrea
Schubert Ingo
Schulman Alan H.
Smýkal Petr
Snowdon Rod J.
Stein Nils
Stoddard Frederick L.
Stougaard Jens
Tanskanen Jaakko
Tayeh Nadim
Torres Ana M.
Törönen Petri
Usadel Björn
Warsame Ahmed O.
Wittenberg Alexander H.J.
Zhang Hailin
Čížková Jana
Publication date: 01/01/2023

Increasing the proportion of locally produced plant protein in currently meat-rich diets could substantially reduce greenhouse gas emissions and loss of biodiversity. However, plant protein production is hampered by the lack of a cool-season legume equivalent to soybean in agronomic value. Faba bean (Vicia faba L.) has a high yield potential and is well suited for cultivation in temperate regions, but genomic resources are scarce. Here, we report a high-quality chromosome-scale assembly of the faba bean genome and show that it has expanded to a massive 13 Gb in size through an imbalance between the rates of amplification and elimination of retrotransposons and satellite repeats. Genes and recombination events are evenly dispersed across chromosomes and the gene space is remarkably compact considering the genome size, although with substantial copy number variation driven by tandem duplication. Demonstrating practical application of the genome sequence, we develop a targeted genotyping assay and use high-resolution genome-wide association analysis to dissect the genetic basis of seed size and hilum colour. The resources presented constitute a genomics-based breeding platform for faba bean, enabling breeders and geneticists to accelerate the improvement of sustainable protein production across the Mediterranean, subtropical and northern temperate agroecological zones.

Central Archive at the University of Reading

Jukuri

Juelich Shared Electronic Resources

Helsingin yliopiston digitaalinen arkisto

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

Skim-based genotyping by sequencing

Author: Bayer Philipp E.
Edwards David
Golicz Agnieszka A.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Genotyping by sequencing (GBS) is a relatively new method used to determine the differences in the genetic makeup of individuals. Its novelty stems from a combination of two already available methods: genotyping and next-generation sequencing. Depending on the individual study design, GBS protocols can take multiple forms; however, most share a sequence of core steps. These include sequencing of the DNA from the individuals of interest (usually two parents of a mapping population and their progeny), mapping of the sequencing reads to the reference sequence, SNP calling and filtering, SNP genotyping and imputation, followed by haplotype identification and downstream analysis. GBS has a range of applications, from general marker discovery, haplotype identification, and recombination characterization to quantitative trait locus (QTL) analysis, genome-wide association studies (GWAS), and genomic selection (GS). It has already been applied to a range of plant species, including rice, maize, artichoke, and Arabidopsis thaliana. It is a promising approach which is likely to provide new and important insights into plant biology.

University of Queensland eSpace

Genome-wide analysis of the Hsf gene family in Brassica oleracea and a comparative analysis of the Hsf gene family in B. oleracea, B. rapa and B. napus

Author: Bhalla Prem L.
Golicz Agnieszka A.
Lohani Neeta
Singh Mohan B.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Abiotic and biotic stresses induced by global climate change are predicted to affect crop-growing seasons and crop yield. Heat stress transcription factors (Hsfs) have been suggested to play a significant role in various stress responses. They are an integral part of the signal transduction pathways that operate in response to environmental stresses. Brassica oleracea is an agronomically important crop species which includes cabbage, cauliflower, broccoli, Brussels sprout, kohlrabi and kale. The identity and roles of Hsfs in this important Brassica species are unknown. The availability of the whole genome sequence of B. oleracea provides an opportunity to perform in silico analysis of Hsf genes in B. oleracea. Thirty-five putative genes encoding Hsf proteins were identified and classified into A, B and C classes. Their evolution, physical location, gene structure, domain structure and tissue-specific expression patterns were investigated. Further, a comparative analysis of the Hsf gene family in B. oleracea, B. rapa and B. napus highlighted the role of hybridisation and allopolyploidy in the evolution of the largest known Hsf gene family in B. napus. The presence of orthologous gene clusters, found in Brassica species, but not in A. thaliana, suggested that polyploidisation has resulted in the formation of new Brassica-specific orthologous gene clusters. Gene duplication analysis indicated that the evolution of the Hsf gene family was under strong purifying selection in these Brassica species. High-level synteny was observed within the B. napus genome. Conservation of physical location, the similarity of structure and similar expression profiles between the B. napus Hsf genes and the corresponding genes from B. oleracea and B. rapa suggest a high functional similarity between these genes. This study paves the way for further investigation of Hsf genes in improving stress tolerance in B. oleracea. The genes thus identified may be useful for developing crop varieties resilient to global climate change.

Western Sydney ResearchDirect